[57] ABSTRACT

A method and apparatus for analyzing audio input events. A 
template is utilized to analyze audio input events. A speech 
audio input event is identified. The identified speech audio 
input event is recorded. The recorded speech audio input 
event is processed to create a first entry in a template. A 
selected non-speech audio input event which occurs in a 
selected environment is identified. The identified non- 
speech audio input event is recorded. Then the recorded 
non-speech audio input event is processed to create a second 
entry in the template. Thereafter, a speech audio input event
and a non-speech audio input event are distinguished by
comparing an audio input event to the template.

8 Claims, 7 Drawing Sheets 
METHOD AND APPARATUS FOR SPEECH
RECOGNITION FOR DISTINGUISHING
NON-SPEECH AUDIO INPUT EVENTS FROM
SPEECH AUDIO INPUT EVENTS

BACKGROUND OF THE INVENTION

1. Technical Field

The present invention relates in general to the field of
speech recognition and in particular to the field of recognition
of unknown phrases. Still more particularly, the present
invention relates to a method and apparatus for speech
recognition, which takes into account background noises.

2. Description of the Related Art

Speech analysis and speech recognition algorithms,
machines, and devices are becoming more and more common.
Such systems have become increasingly powerful and
less expensive. Within recent years, an explosion in the use
of voice recognition systems has occurred. These systems
allow a user on a data processing system to employ voice
activated commands to direct various programs and applications.
One goal of voice recognition systems is to provide
a more humanistic interface for operating a data processing
system. Voice recognition systems, typically, are used with
other input devices, such as a mouse, keyboard, or printer.
These input devices often are used to supplement the
input/output ("I/O") processes of voice recognition systems.
Various known voice recognition systems, typically, contain a
set, i.e., a template, of recognizable phrases from which the
user can speak to use voice activated commands. At any
instant in time, the voice recognition system's memory
contains a recognizable set. This recognizable set contains a
set of digitized audio phrases from which to choose a
recognizable phrase. For example, if 64 trained phrases are
within the voice recognition system's memory, the detected
sounds, background or intentional, are compared to this
recognizable set. Thus, an unintentional background noise
may create a confidence factor that may be interpreted as a
recognizable phrase within the set.

Typically, the monitoring of an audio environment causes
the voice recognition system to detect background noises.
These background noises are often interpreted as user
recognizable inputs. Such a situation can cause a problem,
involving the voice recognition system performing operations
or commands because of background noise. Attempts
have been made to solve this problem through the use of
calibration techniques. Such a method essentially involves
using the voice recognition system to initially monitor a
background noise sample. The sample functions as an
aggregated factor when the voice recognition system is actually
listening for recognizable phrases. These calibration techniques
are often inefficient and often assume the sample of
background noise detected during the calibration phase is
identical or similar to the background noise that will exist
during the actual recognition phase.

Other approaches have allowed the user to manually
disable the recognition mode of the voice recognition system.
Such an approach, however, requires manual enabling
and disabling of the recognition mode when the user suspects
that the background noise will interfere with the
operation of the voice recognition system. This technique
often requires the user to remember which mode the voice
recognition system is operating within. Moreover, it can be
extremely cumbersome to enable and disable the voice
recognition system. Often these background
noises are induced by the user. Sounds from peripheral devices,
such as keyboard sounds and printer sounds, are an example of
background noise often initiated by the user. These noises
can interfere with the operation of the voice recognition
system, i.e., causing the system to recognize background
noises as phrases corresponding to a command or function.

The problem of inadvertently selecting a background
noise as a recognizable phrase is due to the background
noise closely mimicking a phrase within the recognizable set
that is within the voice recognition system's memory.

Therefore, it would be advantageous to have a method and
apparatus by which the operation of peripheral devices that
produce background noise can be recognized as background
noise during the recognition mode of the voice recognition
system.

SUMMARY OF THE INVENTION

It is one object of the present invention to provide an
improved method and apparatus for speech recognition.

It is another object of the present invention to provide an
improved method and apparatus for recognition of unknown
phrases.

It is yet another object of the present invention to provide
an improved method and apparatus for speech recognition,
which takes into account background noises.

The present invention provides a method and apparatus for
analyzing audio input events. The present invention utilizes
a template to analyze audio input events. A speech audio
input event is identified. The identified speech audio input
event is recorded. The recorded speech audio input event is
processed to create a first entry in a template. A selected
non-speech audio input event which occurs in a selected
environment is identified. The identified non-speech audio
input event is recorded. Then the recorded non-speech audio
input event is processed to create a second entry in the
template. Thereafter, a speech audio input event and a
non-speech audio input event are distinguished from each
other by comparing an audio input event to the template,
wherein the non-speech audio input event is identified.

The above as well as additional objectives, features, and
advantages of the present invention will become apparent in
the following detailed written description.

BRIEF DESCRIPTION OF THE DRAWINGS

The novel features believed characteristic of the invention
are set forth in the appended claims. The invention itself,
however, as well as a preferred mode of use, further objectives
and advantages thereof, will best be understood by
reference to the following detailed description of an illustrative
embodiment when read in conjunction with the
accompanying drawings, wherein:

FIG. 1 is a multimedia data processing system in accordance
with a preferred embodiment of the present invention;

FIG. 2 depicts a block diagram representation of the
principal hardware components utilized to execute
applications, such as a voice recognition system, in accordance
with a preferred embodiment of the present invention;

FIG. 3 is a high level flow chart of a process employed by
a user to train an application to recognize voice recognition
commands in accordance with a preferred embodiment of
the present invention;

FIG. 4 depicts a template illustrated in accordance with a
preferred embodiment of the present invention;

FIG. 5 is a flow chart of a process for registering peripheral
devices that can create noise or background sounds in
accordance with a preferred embodiment of the present
invention;






5,764,852

FIG. 6 depicts a flow chart of a process for training and
updating a voice recognition system to detect and differentiate
between sounds that are speech audio input events or
non-speech audio input events; and

FIG. 7 is a flowchart of a process for analyzing audio
input events in a data processing system in accordance with
a preferred embodiment of the present invention.

DETAILED DESCRIPTION OF PREFERRED
EMBODIMENT

With reference now to the figures and in particular with
reference to FIG. 1, there is depicted multimedia data
processing system 11 which includes a plurality of multimedia
end devices 13 which are electrically connected to
computer 15. Those skilled in the art will, upon reference to
the specification, appreciate that computer 15 may comprise
any personal computer system well known in the prior art,
such as the PS/2 IBM Computer manufactured by International
Business Machines Corporation of Armonk, N.Y. The
plurality of multimedia end devices 13 include all types of
multimedia end devices which either produce or consume
real-time and/or asynchronous streamed data, and include
without limitation such end devices as CD-ROM player 17,
microphone 19, keyboard 21, telephone 23, and video monitor 25.
Each of these multimedia end devices 13 may be called by
multimedia application software to produce or consume the
streamed data.

For example, the operation of CD-ROM player 17 may be
controlled by multimedia application software which is
resident in, and executed by, computer 15. The real-time
digital data stream generated as an output of CD-ROM
player 17 may be received and processed by computer 15 in
accordance with instructions of the multimedia application
resident therein. For example, the real-time digital data
stream may be compressed for storage on a conventional
computer floppy disk or for transmission via modem over
ordinary telephone lines for receipt by a remotely located
computer system which may decompress and play the digital
streamed data on analog audio equipment. Alternatively, the
real-time data stream output from CD-ROM player 17 may
be received by computer 15, and subjected to digital or
analog filtering, amplification, and sound balancing before
being directed, in analog signal form, to analog stereo
amplifier 29 for output on audio speakers 31 and 33.

Microphone 19 may be used to receive analog input
signals corresponding to ambient sounds. The real-time
analog data stream may be directed to computer 15, converted
into digital form, and subjected to manipulation by the
multimedia application software, such as a voice recognition
program. The digital data may be stored, compressed,
encrypted, filtered, subjected to transforms, outputted in
analog form to analog stereo amplifier 29, directed as an
output in analog form to telephone 23, presented in digitized
analog form as an output of a modem for transmission on
telephone lines, transformed into visual images for display
on video monitor 25, or subjected to a variety of other
different and conventional multimedia digital signal processing
operations.

In a similar fashion, the analog and digital inputs and
outputs of keyboard 21, telephone 23, and video monitor 25
may be subjected to conventional multimedia operations in
computer 15. In particular, computer 15 may be used as a
voice recognition system to direct commands and functions
for other applications executing on computer 15. Microphone
19 may be used to receive speech audio input events,
i.e., human speech. The audio input events may be processed
using a multimedia application that is directed towards
recognizing speech from analyzing inputs from microphone
19.

FIG. 2 is a block diagram representation of the principal
hardware components which are utilized in the present
invention to execute multimedia applications which control
the operation of multimedia end devices 13. As is conventional
in multimedia data processing operations, a central
processing unit (CPU) 33 is provided in computer 15.
Typically, the multimedia application software, such as a
voice recognition application, is resident in RAM computer
memory 35. CPU 33 executes the instructions which comprise
the multimedia application. Also, as is typical in
multimedia data processing operations, digital signal processor
37 is provided as an auxiliary processor, which is
dedicated to performing operations on the real-time and/or
asynchronous streamed data. As is well known to those
skilled in the art, digital signal processors are microprocessors
which are dedicated to performing operations
based upon, or which include, real-time data and are thus
designed to be very fast and respond quickly to allow the
real-time operational nature of the multimedia end devices 13.
Typically, in order to speed up the operation of the digital
signal processor 37, a conventional direct memory access
(DMA) 39 is provided to allow for the rapid fetching and
storing of data. In the present invention, separate instruction
memory (IM) 41 and data memory (DM) 43 are provided to
further speed up the operation of digital signal processor 37.
Bus 45 is provided to communicate data between digital
signal processor 37 and hardware interface 47, which
includes digital-to-analog and analog-to-digital converters.
Inputs and outputs for the various multimedia end devices
13 are connected through the digital-to-analog (D/A) and
analog-to-digital (A/D) converter 47. In FIG. 2, a telephone
input/output 49, a microphone input 53, and stereo outputs
55, 57 are depicted, in an exemplary manner, and are
connected through the A/D and D/A converters in hardware
interface 47. MIDI input/output also is connected through
hardware interface 47 to digital signal processor 37 but is not
connected to A/D or D/A converters.

Referring next to FIG. 3, a high level flow chart of a
process employed by a user to train an application to
recognize voice recognition commands is depicted in accordance
with a preferred embodiment of the present invention.
The user must "train" a set, i.e., a template, of phrases for
recognition, as illustrated in block 300. The user also defines
a set of actions in the form of macros setting forth predefined
actions, as depicted in block 302 in accordance with a
preferred embodiment of the present invention. The user
then correlates or associates particular phrases to the actions,
as illustrated in block 304. In other words, the user associates
a voice phrase or ideal input event to a macro. The user
then loads a template for a particular application, as depicted
in block 306. Typically during this phase of the process, the
voice recognition system may encounter background noise.
This noise may match an entry within the template, i.e., meet
a confidence factor of an entry within the set of phrases.
Also, the recognizable set within the template may not
perform all commands desired by the user for a particular
application. For example, the desired voice recognition
template may not be currently within the voice recognition
system's memory. In such cases, the user may issue a voice
command to swap templates from the memory. This occurs
when the user requires an additional template, as illustrated
in block 308. In another situation, the user may require a
new set of templates to be loaded when the user loads a new
application, as depicted in block 310.

The present invention employs a method and apparatus
that allows a voice recognition system to automatically
register background noises produced by peripheral devices.
The present invention also may automatically enable and
disable the voice recognition mode based on interrupts from 
the peripheral devices. The present invention involves a 
method and apparatus by which the background interrupt 
noise is not disregarded, but dynamically added to the set of 
recognizable phrases. The registration of a null command 
accompanies the background noise phrase in accordance 
with a preferred embodiment of the present invention. 

Alternatively, a command may be associated with the 
background noise phrase. For example, the command may 
disable the voice recognition system until some other event 
occurs. This approach allows for a dynamic training of the 
voice recognition system for background noise. The system 
may be trained to recognize different background noises, 
which decreases the probability that a background noise will 
be mistaken for a recognizable phrase within the set, i.e., the
recognizable set now includes a confidence factor for the 
background noise. 

Referring now to FIG. 4, a template 400 is illustrated in
accordance with a preferred embodiment of the present
invention. The PHRASE column identifies the textual
representation of recognizable phrases. The COMMAND column
identifies the command, e.g., a keyboard macro, that will
be executed upon the voice recognition system recognizing
the audio phrase. For example, upon the voice recognition
system recognizing the phrase "PRINT DOCUMENT", the
function key F7 will be sent to the keyboard buffer, followed
by the word DOC, followed by the ENTER key (E) entering
the keyboard buffer. As a result, the application will receive
the F7 DOC ENTER keystrokes upon the voice recognition
system recognizing the phrase "PRINT DOCUMENT".
These would be the commands necessary for an application
to print a document. The DIGITIZED FORM column shows
a graphical representation of an audio sample for each 
phrase within the template. The representations are for 
purposes of illustration only and represent an average of 
how the user speaks the particular phrase, i.e., trained 
sample phrases. 
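
The three columns of template 400 can be pictured as a small record type.
The following is only an illustrative sketch: the names TemplateEntry and
keystrokes_for, the sample values in digitized_form, and the "{KEYBOARD
NOISE}" entry are assumptions made for this example and do not appear in
FIG. 4.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TemplateEntry:
    """One row of a template such as template 400 (hypothetical layout)."""
    phrase: str                    # PHRASE column: textual representation
    command: Optional[List[str]]   # COMMAND column: keystroke macro, None = null command
    digitized_form: List[float]    # DIGITIZED FORM column: trained audio sample


# Illustrative contents only; the sample values are made up.
template = [
    TemplateEntry("PRINT DOCUMENT", ["F7", "D", "O", "C", "ENTER"], [0.1, 0.8, 0.3]),
    TemplateEntry("{KEYBOARD NOISE}", None, [0.9, 0.1, 0.9]),
]


def keystrokes_for(phrase: str, entries: List[TemplateEntry]) -> List[str]:
    """Return the keystrokes to place in the keyboard buffer for a phrase.

    A null command (None) yields no keystrokes, so a recognized
    background noise has no effect on the application.
    """
    for entry in entries:
        if entry.phrase == phrase:
            return list(entry.command) if entry.command else []
    raise KeyError(phrase)
```

With this layout, an entry created for a background noise carries a null
command, so recognizing it sends nothing to the keyboard buffer.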

A comparison of the digitized sound form to the digitized
forms trained by the user within the template is performed
by the voice recognition system upon detecting a sound or audio
input event. Upon detecting background noises as defined by
the interrupt criteria, the present invention dynamically
introduces sound phrases into the template. The symbols { }
designate entries for phrases produced by the invention, as can
be seen in the PHRASE column in FIG. 4.

In accordance with a preferred embodiment of the present
invention, an association of a null command to the entry for
a created phrase may be made. The voice recognition
system, upon detecting an audio input event, background or
human voice, compares the sound to all entries within the
template. A background sound also is referred to as a
"non-speech audio input event" and a human voice sound also is
referred to as a "speech audio input event". A higher
confidence factor exists for a comparison of a background
noise to a background noise sample because the voice
recognition system compares audio input events to a
recognizable set of entries within the template.
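
The comparison against every template entry can be sketched as below. This
is a minimal illustration, not the recognizer described in the
specification: the distance-based confidence function, the 0.5 threshold,
and the names confidence and best_match are all assumptions chosen to keep
the example short.

```python
import math
from typing import Dict, List, Optional


def confidence(sample: List[float], trained: List[float]) -> float:
    """Hypothetical confidence factor: 1.0 for identical forms, falling toward 0."""
    distance = math.sqrt(sum((a - b) ** 2 for a, b in zip(sample, trained)))
    return 1.0 / (1.0 + distance)


def best_match(sample: List[float],
               template: Dict[str, List[float]],
               threshold: float = 0.5) -> Optional[str]:
    """Compare a sample to all entries and return the best one above threshold."""
    best_name, best_conf = None, 0.0
    for name, trained in template.items():
        conf = confidence(sample, trained)
        if conf > best_conf:
            best_name, best_conf = name, conf
    return best_name if best_conf >= threshold else None


# Illustrative template: one trained phrase and one registered noise entry.
entries = {
    "PRINT DOCUMENT": [0.1, 0.8, 0.3],
    "{PRINTER NOISE}": [0.9, 0.1, 0.9],
}
```

Because the noise has its own entry, a printer-like sample scores highest
against "{PRINTER NOISE}" rather than against a spoken phrase.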

Referring now to FIG. 5, a flow chart of a process for
registering peripheral devices that can create noise or
background sounds (non-speech audio input events) is depicted
in accordance with a preferred embodiment of the present
invention. The process begins by receiving user input in the
form of names for the peripheral devices, as illustrated in
block 502. The process then receives user input identifying
interrupts for each of the peripheral devices, as depicted in
block 504. Thereafter, user input is received by the process,
designating associated communications ports for the peripheral
devices, as illustrated in block 506. The process then
receives user input as to the elapsed time for the recognition
of devices, as depicted in block 508.

Next, user input identifying any optional commands to be
executed upon recognition is received, as illustrated in
block 510. The process then receives user input as to the
notification preferences, as depicted in block 512. A user
may choose to be notified when an appropriate recognition
is made. The process then determines whether the user
desires to be notified during detection of noise from peripheral
devices, as illustrated in block 514. If the user desires
notification, the process then receives user input specifying
the output device for notification, as depicted in block 516.
The user may be notified via various output devices, such as
a speaker for audio notification or a video monitor for video
notification. The process then enables the notification flag,
as illustrated in block 518.

Thereafter, the process terminates after storing the information
entered by the user in a device recognition table, as
depicted in block 520. Referring back to block 514, if the
user does not desire notification, the process also proceeds
to block 520. A device recognition table may take various
forms, such as a file containing fields or a relational database.
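
One possible shape for a device recognition table is sketched below, with
one record per registered peripheral device. The field names, the function
register_device, and the port strings are assumptions for illustration; the
interrupt numbers reuse the 14H keyboard and 5H printer examples given
elsewhere in this description.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Dict, List, Optional


@dataclass
class DeviceRecord:
    """Hypothetical record holding the user input of blocks 502 to 518."""
    name: str                        # block 502
    interrupt: int                   # block 504
    port: str                        # block 506
    elapsed_time_s: float            # block 508
    commands: List[str] = field(default_factory=list)  # block 510
    notify: bool = False             # notification flag, block 518
    output_device: Optional[str] = None                 # block 516


def register_device(table: Dict[str, DeviceRecord], name: str, interrupt: int,
                    port: str, elapsed_time_s: float,
                    commands: Optional[List[str]] = None,
                    output_device: Optional[str] = None) -> DeviceRecord:
    """Store one device; the flag is enabled only if an output device is given."""
    record = DeviceRecord(name, interrupt, port, elapsed_time_s,
                          list(commands or []),
                          notify=output_device is not None,
                          output_device=output_device)
    table[name] = record             # block 520
    return record


table: Dict[str, DeviceRecord] = {}
register_device(table, "keyboard", 0x14, "KBD", 0.5)
register_device(table, "printer", 0x05, "LPT1", 2.0, output_device="speaker")

# A "file containing fields" could be as simple as serialized JSON.
serialized = json.dumps({k: asdict(v) for k, v in table.items()})
```

A relational database table with the same columns would serve equally well;
JSON is used here only to keep the sketch self-contained.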
Referring now to FIG. 6, a flow chart of a process for
training and updating a voice recognition system to detect
and differentiate between sounds (also called "audio input
events") that are speech audio input events or non-speech
audio input events is depicted. The process begins by loading
a device recognition table into active memory, as illustrated
in block 600. The device recognition table is the data entered
by the user and stored as illustrated in FIG. 5. The process then sets
the interrupt vectors to intercept interrupts from the peripheral
devices designated in the device recognition table
before they reach the target application, as illustrated in
block 602. The process then activates a monitoring service,
as depicted in block 604. A monitoring service used to
monitor for interrupts is well known to those of ordinary
skill in the art and various methods may be employed in
accordance with a preferred embodiment of the present
invention.

The process then awaits an interrupt from a peripheral
device, as illustrated in block 606. The process then receives
an interrupt from the peripheral device, as depicted in block
608. Next, the process passes the interrupt to an existing
application address to finally deliver the interrupt to the
target application, as illustrated in block 610. The process
next marks the time of the reception of the interrupt, as
depicted in block 612. Next, the process starts an expiry
clock, as illustrated in block 614. An expiry clock is basically
a timer that is employed in a preferred embodiment of
the present invention to determine how much time has
passed since the detection of an interrupt.
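
An expiry clock of this kind can be sketched as a small timer object. The
class name ExpiryClock, its methods, and the optional `now` argument (which
lets a caller supply its own timestamps) are assumptions made for this
example, not elements of FIG. 6.

```python
import time
from typing import Optional


class ExpiryClock:
    """Hypothetical timer measuring time since an interrupt was detected."""

    def __init__(self, limit_s: float) -> None:
        self.limit_s = limit_s                   # allowed time for recognition
        self.started_at: Optional[float] = None  # mark for the interrupt time

    def start(self, now: Optional[float] = None) -> None:
        """Mark the time of reception of the interrupt (blocks 612 and 614)."""
        self.started_at = time.monotonic() if now is None else now

    def elapsed(self, now: Optional[float] = None) -> float:
        """Return the seconds passed since the interrupt was marked."""
        if self.started_at is None:
            raise RuntimeError("clock has not been started")
        current = time.monotonic() if now is None else now
        return current - self.started_at

    def expired(self, now: Optional[float] = None) -> bool:
        """Return True once the allowed time for recognition has passed."""
        return self.elapsed(now) > self.limit_s

    def clear(self) -> None:
        """Clear the mark for the interrupt time."""
        self.started_at = None
```

A monotonic clock is used by default so that changes to the wall-clock time
cannot make the elapsed period negative.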

The process then awaits an audio recognition, as depicted
in block 616. In other words, the process waits to see if a
recognizable pattern, a pattern that meets a confidence
threshold for an entry in the template, is detected. Upon the
recognition of audio, the process then determines whether an
audio interrupt has been received, as illustrated in block 618.
An audio interrupt occurs when an input device, such as a
microphone, detects an audio input event. If an audio
interrupt has not been received, the process then determines
whether the time has expired for recognition, as depicted in
block 620. If time has expired for recognition, the process
then clears the mark for the time that the interrupt is
received, as illustrated in block 622, with the process then
returning to block 606 to await an interrupt from a peripheral
device. Referring again to block 620, if time has not expired
for recognition, the process then returns to block 616 to
await an audio recognition.

Referring again to block 618, if an audio interrupt is
received, the process then proceeds to receive the audio
pattern (the audio input event), as depicted in block 624. The
process marks the time of reception of the audio pattern, as
illustrated in block 626. Thereafter, the process determines
whether the audio pattern is recognizable, as depicted in
block 628. If the audio pattern is not recognizable, the
process then proceeds to clear the mark for the time of
reception of the audio pattern, as illustrated in block 630.
Thereafter, the process proceeds to block 622 as described
above.

Referring again to block 628, if the audio pattern is
recognizable, the process then subtracts the interrupt time
for the peripheral device from the audio interrupt time to
determine an elapsed period of time, as depicted in block
632. The process next determines whether the time period
calculated is within an acceptable range, as depicted in block
634. If the time period is not within an acceptable range, the
process proceeds to block 630 as described previously. On
the other hand, if the period of time is within the acceptable
range, the process then determines whether the user is to be
notified of the non-speech audio input
event, as depicted in block 636. If the user is not to be notified,
the process then determines whether commands are to be
executed, as illustrated in block 638.

Referring back to block 636, if the user is to be notified,
the process then proceeds to notify the user according to the
recognition table definitions, as illustrated in block 640.
Thereafter, the process also determines whether commands
are to be executed, as depicted in block 638. If commands
are to be executed, the process then executes the commands
according to the recognition table, as depicted in block 642.
Thereafter, the noise (non-speech audio input event) is
stored as a recognizable template pattern, as illustrated in
block 644. In other words, the non-speech audio input event
is stored as an entry in the template. The process then
proceeds to block 630 as previously described. Referring
again to block 638, if commands are not to be executed, the
process proceeds directly to block 644.

With reference now to FIG. 7, a flowchart of a process for
analyzing audio input events in a data processing system is
depicted in accordance with a preferred embodiment of the
present invention. The process begins by identifying and
recording a speech audio input event, as depicted in block 700.
Next, the recorded speech audio input event is processed to
create a first entry in a template, as illustrated in block 702.
The process then identifies and records a non-speech audio
input event, as depicted in block 704. The recorded non-speech
audio input event is then processed to create a second entry
in the template, as illustrated in block 706. Next, the process
detects an interrupt, as depicted in block 708. Then, an audio
input event is detected after the detection of an interrupt, as
illustrated in block 710. The process then identifies a
non-speech audio input event by comparing the detected audio
input event with the template, as depicted in block 712.
After the expiration of a period of time after the interrupt,
the identified non-speech audio input event is processed to
replace the second entry in the template for the processed
non-speech audio input event, as illustrated in block 714.
Additionally, in response to identifying a non-speech audio
input event, a determination is made after the expiration of
a period of time after the interrupt occurs as to whether a
command is associated with the non-speech audio input
event, as depicted in block 716. If a command is associated
with the non-speech audio input event, the command is
executed, as illustrated in block 718, with the process
terminating thereafter. With reference again to block 716, if a
command is not associated with the non-speech audio input
event, the process also terminates. Blocks 714 and 716 both
occur in response to an identification of a non-speech audio
input event.

In accordance with a preferred embodiment of the present
invention, the process depicted in FIG. 6 is implemented as
a terminate and stay resident ("TSR") service that intercepts
the interrupts registered by the user or some other entity,
such as the administrator of the application. Interrupts from
the peripheral devices are immediately sent to their associated
interrupt vector table addresses, i.e., the interrupt transfers
to the appropriate device. This ensures that the
keyboard interrupt service receives the keyboard interrupt
and the printer services receive their output. The detectable
voice recognition phrases which meet no confidence factor
within the template, but are received within the designated
time of an interrupt, are candidates for null associations in
accordance with a preferred embodiment of the present
invention. Each peripheral device has an associated interrupt
defined for it. For example, a personal computer may use
interrupt 14H for a keyboard interrupt.

The present invention may be directed to intercept hardware
interrupts or interrupts of the operating system. The
registration process illustrated in FIG. 5 allows a user to
specify the interrupts upon which recording should be
activated for the audio input event. The user may adjust the
sensitivity at which an interrupt should be interpreted as
background noise or a non-speech audio input event.
Predefined defaults may be set for existing devices, e.g.,
printers normally operate on interrupt 5H.

In accordance with a preferred embodiment of the present
invention, a pre-process may be employed to evaluate if the
audio input event detected should be compared to an existing
null phrase or to create a new phrase. Such a process may
involve the continuous employment of background noise to
train the system for a particular noise phrase. In addition,
null commands may be substituted for user supplied or
system default commands. For example, the commands
could issue a SAVE command for a word processor. With an
increase in noise activity, such as interference, the user may
wish to save the work in progress in case
one of the background sounds matches an executable phrase.

The user also may be graphically notified when null
associations are created. Such a notification also can be
made through audio means. Moreover, the user may be
allowed to modify null commands upon notification of
creation.

The present invention also may allow for entire template
switching based upon the type of interrupt received, rather
than a null association. Such an option would signify that the
interrupt detected may be a preeminent signal to a new
application that employs a different set of voice recognition
phrases, requiring a new set of templates.

In accordance with a preferred embodiment of the present
invention, the fundamental problem of voice recognition
systems involving differentiation of non-speech audio input
events from speech audio input events is addressed. The
present invention recognizes that peripheral devices may
produce background noise and that a system may be allowed
to essentially execute commands that are irrelevant to the
applications in response to this background noise. The
present invention provides a method and apparatus for
allowing peripheral devices to affix digitized sound phrases
within voice recognition sets, such as templates. The present
invention provides further advantage over prior art methods
in that the present invention does not need to always be
activated. Once a background noise is "trained" and registered
into the template, the invention may be disabled or
removed. This provides an advantage of freeing computer
resources for other applications.
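
The timing test of blocks 632 through 644 in FIG. 6 can be sketched as
follows. This is a simplified illustration under stated assumptions: the
function name maybe_register_noise, the dictionary layout of the template,
and the rule that the audio must not precede the interrupt are hypothetical,
and the notification and command steps (blocks 636 to 642) are omitted.

```python
from typing import Dict, List


def maybe_register_noise(template: Dict[str, dict], device: str,
                         interrupt_time: float, audio_time: float,
                         pattern: List[float], max_elapsed: float) -> bool:
    """Store a noise entry if the audio followed the interrupt closely enough.

    Returns True when an entry was added to the template, False otherwise.
    """
    # Block 632: subtract the interrupt time from the audio interrupt time.
    elapsed = audio_time - interrupt_time
    # Block 634: check the acceptable range (the lower bound of 0 is an assumption).
    if not (0.0 <= elapsed <= max_elapsed):
        return False
    # Block 644: store the non-speech audio input event with a null command.
    template["{" + device + "}"] = {"pattern": list(pattern), "command": None}
    return True


noise_template: Dict[str, dict] = {}
added = maybe_register_noise(noise_template, "PRINTER", 10.0, 10.4, [0.9, 0.1], 1.0)
skipped = maybe_register_noise(noise_template, "KEYBOARD", 10.0, 12.5, [0.2, 0.7], 1.0)
```

Only the sound that arrives within the designated time of its interrupt is
registered, which mirrors the null association described above.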

While the invention has been particularly shown and 
described with reference to a preferred embodiment, it will 
be understood by those skilled in the art that various changes 
in form and detail may be made therein without departing 
from the spirit and scope of the invention.

I claim: 

1. A method for analyzing audio input events in a data
processing system, wherein said data processing system
utilizes a template to analyze audio input events, wherein
said data processing system includes a peripheral device that
generates said audio input event and an interrupt, said
method comprising the steps of:

identifying a speech audio input event; 

recording said identified speech audio input event; 

processing said recorded speech audio input event to 
create a first entry in a template; 

identifying a selected non-speech audio input event which 
occurs in a selected environment; 

recording said identified non-speech audio input event;

processing said recorded non-speech audio input event to 
create a second entry in said template; and 

thereafter, distinguishing between a speech audio input 
event and a non-speech audio input event by comparing 
said audio input event to said template in response to
detecting said interrupt and detecting said audio input
event within a preselected amount of time, wherein said
non-speech audio input event is identified. 

2. The method of claim 1, further comprising:

determining whether a command is associated with said
non-speech audio input event in response to identification
of said non-speech audio input event; and

responsive to said command being associated with said
non-speech audio input event, executing said command.

3. A method for analyzing audio input events in a data 
processing system, wherein said data processing system 
utilizes a template to analyze audio input events and wherein 
said data processing system includes a peripheral device that 
generates an audio input event and an interrupt, said method
comprising the steps of: 

identifying a speech audio input event; 

recording said identified speech audio input event; 

processing said recorded speech audio input event to 
create a first entry in a template;

identifying a selected non-speech audio input event which 
occurs in a selected environment; 

recording said identified non-speech audio input event;

processing said recorded non-speech audio input event to
create a second entry in said template for said processed
non-speech audio input event;

detecting an interrupt; 

detecting said audio input event, wherein said audio input 
event occurs after said interrupt;

identifying a non-speech audio input event by comparing 
an audio input event to said template; 



responsive to identifying a non-speech audio input event 
occurring a preselected amount of time after said 
interrupt occurs, determining whether a command is 
associated with said non-speech audio input event; and 

executing said command in response to said command
being associated with said non-speech audio input
event.

4. The method of claim 3 further comprising processing
said identified non-speech audio input event occurring said
preselected amount of time after said interrupt occurs to
replace said second entry in said template for said processed
non-speech audio input event.

5. An apparatus for analyzing audio input events, wherein
said apparatus utilizes a template to analyze audio input events,
wherein said apparatus includes a peripheral device that
generates an audio input event and an interrupt, said apparatus
comprising:

first identification means for identifying a speech audio 
input event; 

first recording means for recording said identified speech
audio input event;

first processing means for processing said recorded
speech audio input event to create a first entry in a
template;

second identification means for identifying a selected 
non-speech audio input event which occurs in a 
selected environment; 

second recording means for recording said identified 
non-speech audio input event; 

second processing means for processing said recorded 
non-speech audio input event to create a second entry 
in said template for said processed non-speech audio 
input event; and 

comparison means for distinguishing between a speech 
audio input event and a non-speech audio input event 
by comparing said audio input event to said template in 
response to detecting said interrupt and detecting said 
audio input event within a preselected amount of time, 
wherein said non-speech audio input events may be 
efficiently distinguished from speech audio input 
events. 

6. The apparatus of claim 5, further comprising:

means for determining whether a command is associated
with said non-speech audio input event in response to
identification of said non-speech audio input event; and

responsive to said command being associated with said
non-speech audio input event, means for executing said
command.

7. An apparatus for analyzing audio input events,
wherein said apparatus utilizes a template to analyze audio
input events and wherein said data processing system
includes a peripheral device that generates an audio input
event and an interrupt, said apparatus comprising:

first identification means for identifying a speech audio 
input event; 

first recording means for recording said identified speech
audio input event;

first processing means for processing said recorded
speech audio input event to create a first entry in a
template;

second identification means for identifying a selected 
non-speech audio input event which occurs in a 
selected environment; 

second recording means for recording said identified 
non-speech audio input event; 




second processing means for processing said recorded 
non-speech audio input event to create a second entry 
in said template for said processed non-speech audio 
input event; 

first detection means for detecting an interrupt; 

second detection means for detecting said audio input 
event, wherein said audio input event occurs after said 
interrupt; 

third identification means for identifying a non-speech 
audio input event by comparing an audio input event to 
said template; 

determination means, responsive to identifying a non-speech
audio input event occurring a preselected amount of time
after said interrupt occurs, for determining whether a
command is associated with said non-speech audio input
event; and

execution means for executing said command in response
to said command being associated with said non-speech
audio input event.
8. The apparatus of claim 7 further comprising means for
processing said identified non-speech audio input event
occurring said preselected amount of time after said
interrupt occurs to replace said second entry in said template for
said processed non-speech audio input event.
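
The interrupt-gated flow recited in claims 3 and 4 (and mirrored in claims 7 and 8) can be sketched as follows. This is an illustrative reading only, not the claimed implementation. It assumes the "preselected amount of time" is a window that opens at the interrupt, and the names Entry, closest, handle_event, and the window parameter are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Entry:
    label: str
    is_speech: bool
    features: List[float]  # stand-in for a processed recording


def closest(template: List[Entry], features: List[float]) -> Entry:
    """Compare the audio input event to the template; return the best match."""
    return min(
        template,
        key=lambda e: sum((a - b) ** 2 for a, b in zip(e.features, features)),
    )


def handle_event(template: List[Entry],
                 commands: Dict[str, Callable[[], str]],
                 features: List[float],
                 event_time: float,
                 interrupt_time: Optional[float],
                 window: float = 0.5) -> str:
    """Assumed flow: match, check the interrupt window, retrain, run command."""
    match = closest(template, features)
    if match.is_speech:
        return "speech:" + match.label
    in_window = (interrupt_time is not None
                 and 0.0 <= event_time - interrupt_time <= window)
    if not in_window:
        # No peripheral interrupt explains the noise: leave the entry alone.
        return "non-speech:" + match.label
    # Claim 4 reading: the fresh recording replaces the stored entry.
    match.features = list(features)
    # Claim 3 reading: execute a command only if one is associated.
    action = commands.get(match.label)
    return action() if action else "ignored:" + match.label


# Assumed toy template and command table.
template = [Entry("open file", True, [0.1, 0.5]),
            Entry("printer", False, [0.9, 0.9])]
commands = {"printer": lambda: "notify: print job started"}

print(handle_event(template, commands, [0.8, 0.8],
                   event_time=10.2, interrupt_time=10.0))
```

Gating the retraining step on the interrupt is one way to keep an unrelated noise from overwriting a trained entry; the claims do not prescribe a particular window length, so the 0.5 second default here is arbitrary.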

***** 